Multiple data structure discovery through global optimisation, meta clustering and consensus methods

Authors

  • Ida Bifulco
  • Carmine Fedullo
  • Francesco Napolitano
  • Giancarlo Raiconi
  • Roberto Tagliaferri
Abstract

When dealing with real data, clustering becomes a very complex problem, usually admitting many reasonable solutions. Moreover, even if completely different, such solutions can appear almost equivalent from the point of view of classical quality measures such as the distortion value. This implies that blind optimisation techniques alone are prone to discard qualitatively interesting solutions. In this work we propose a systematic approach to clustering, including the generation of a number of good solutions through global optimisation, the analysis of such solutions through meta clustering and the final construction of a small set of solutions through consensus clustering.
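
The three stages named in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: random k-means restarts stand in for global optimisation, a greedy grouping by Rand index stands in for meta clustering, and a majority vote over co-clustered pairs stands in for consensus clustering. All function names, the 10% distortion tolerance, and the 0.9 similarity threshold are assumptions chosen for illustration.

```python
import random
from itertools import combinations

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's algorithm from a random start; returns (labels, distortion)."""
    centres = rng.sample(points, k)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centres[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centre if a cluster empties
                centres[c] = tuple(sum(x) / len(members) for x in zip(*members))
    distortion = sum(sq_dist(p, centres[l]) for p, l in zip(points, labels))
    return labels, distortion

def rand_index(a, b):
    """Fraction of point pairs on which two labellings agree (together or apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def meta_cluster(solutions, threshold=0.9):
    """Greedy grouping (assumed): join the first group whose leader is similar enough."""
    groups = []
    for labels in solutions:
        for g in groups:
            if rand_index(labels, g[0]) >= threshold:
                g.append(labels)
                break
        else:
            groups.append([labels])
    return groups

def consensus(group, n):
    """Link points co-clustered in more than half of the group's solutions,
    then label the connected components of that graph."""
    label = [None] * n
    current = 0
    for start in range(n):
        if label[start] is not None:
            continue
        label[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if label[j] is None and 2 * sum(s[i] == s[j] for s in group) > len(group):
                    label[j] = current
                    stack.append(j)
        current += 1
    return label

# Toy data (assumed): four blobs on the corners of a square, so splitting them
# left/right or top/bottom gives two different partitions with similar distortion.
rng = random.Random(0)
corners = [(0, 0), (0, 10), (10, 0), (10, 10)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in corners for _ in range(10)]

runs = [kmeans(points, 2, rng) for _ in range(30)]          # 1. many solutions
best = min(d for _, d in runs)
good = [labels for labels, d in runs if d <= 1.1 * best]    # keep near-optimal ones
groups = meta_cluster(good)                                  # 2. meta clustering
representatives = [consensus(g, len(points)) for g in groups]  # 3. consensus
print(len(good), "good solutions in", len(groups), "group(s)")
```

Keeping every solution whose distortion is close to the best, rather than only the minimiser, is what lets qualitatively different partitions survive to the meta clustering step.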


Similar resources

Consensus Clustering + Meta Clustering = Multiple Consensus Clustering

Consensus clustering and meta clustering are two important extensions of the classical clustering problem. Given a set of input clusterings of a given dataset, consensus clustering aims to find a single final clustering which is a better fit in some sense than the existing clusterings, and meta clustering aims to group similar input clusterings together so that users only need to examine a smal...

Multiple Clustering Views from Multiple Uncertain Experts

Expert input can improve clustering performance. In today’s collaborative environment, the availability of crowdsourced multiple expert input is becoming common. Given multiple experts’ inputs, most existing approaches can only discover one clustering structure. However, data is multi-faceted by nature and can be clustered in different ways (also known as views). In an exploratory analysis prob...

Novel consensus quantitative structure-retention relationship method in prediction of pesticides retention time in nano-LC

In this study, quantitative structure-retention relationship (QSRR) methodology was employed to model the retention times of 16 banned pesticides on a nano-liquid chromatography (nano-LC) column. A genetic algorithm-multiple linear regression (GA-MLR) method was employed to develop global and consensus QSRR models. The best global GA-MLR model was established by adjusting the GA parameters. Three de...

Automatic Data Clustering Using an Improved Imperialist Competitive Algorithm

The Imperialist Competitive Algorithm (ICA) is regarded as a leading meta-heuristic for finding the global optimum in optimisation problems. This paper applies ICA to the automatic clustering of huge unlabelled data sets. By using a suitable structure for each chromosome together with the ICA, the suggested method (ACICA) finds the optimum number of clusters at run time while optim...

A Multi-Objective Approach to Fuzzy Clustering using ITLBO Algorithm

Data clustering is one of the most important areas of research in data mining and knowledge discovery. Recent research in this area has shown that the best clustering results can be achieved using multi-objective methods. In other words, assuming more than one criterion as objective functions for clustering data can measurably increase the quality of clustering. In this study, a model with two ...


Journal title:
  • IJKESDP

Volume 1  Issue 

Pages  -

Publication date 2009